This slide deck is available at: [GET] Chris fill in.
18 January 2017
Thanks so much to Garrett Grolemund and RStudio for providing us with helpful suggestions and examples, some of which we've included here.
Thanks to the open source R community for making life way cooleR.
Complete the survey: https://goo.gl/forms/b0UuRnpxpfjwpiD93
Basic familiarity with R (e.g. can install packages, source files, and so on).
Familiarity with R Markdown would be a benefit. For DataFest tutorial, see here Chris
Have a laptop with the following software installed:
R and RStudio
rmarkdown, flexdashboard, and shiny packages for R
Working internet connection
When your output documents are in HTML, you can create interactive visualisations.
Potentially–though not always–more engaging and could let users explore data on their own.
Client Side: HTML documents are rendered on the user's (client's) computer, often with JavaScript in the browser. You simply send them the static HTML/JavaScript needed for their browser to create the plots. This could be sent from a service such as RPubs.
Server Side: Data manipulations and/or plots (e.g. with shinyapps) are done on a server in R. Browsers don't come with R built in.
Lets consumers explore the data on their own.
Are generated by reproducible code that anyone can look at.
Can be instantaneously updated to reflect new data.
You can use R Markdown to create HTML documents.
See http://rmarkdown.rstudio.com/ for an introduction to R Markdown.
Use three back-ticks (```) to start and end a code chunk that is not run.
To create a knit-able code chunk begin the chunk with ```{r}.
Close the chunk with another three back-ticks (```).
Change how R Markdown chunks behave with options. Place options in the chunk head: ```{r echo=FALSE, error=FALSE}
| Option | What it Does |
|---|---|
| echo=FALSE | Does not print the code, only the output |
| error=FALSE | Does not print errors in the output; knitting stops at the first error |
| include=FALSE | Does not include the code or output, but does run the code |
| fig.width | Sets figure width (in inches) |
| cache=TRUE | Caches the chunk. It is only re-run when its contents change. |
Many others at http://yihui.name/knitr/options
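Options can also be set once for the whole document instead of in every chunk head. A minimal sketch, assuming the knitr package is installed; the particular values are only illustrative, and this would normally sit in a setup chunk near the top of the document:

```r
library(knitr)

# Set default options for every chunk that follows.
# Individual chunks can still override these in their own chunk head.
opts_chunk$set(
  echo = FALSE,   # hide code, show output
  fig.width = 6   # figure width in inches
)

# Inspect the current default for one option
opts_chunk$get("echo")
```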
The key is to declare html_document in the header. E.g.:
output:
  html_document
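Outside of RStudio's Knit button, the same document can be knitted from the R console. The sketch below is an illustration under assumptions: the file name `example.Rmd` and its contents are made up, and `render()` needs pandoc (which RStudio bundles), so the call is guarded.

```r
library(rmarkdown)

# Build a tiny R Markdown file in a temporary directory.
# The fence is assembled with strrep() only to avoid nesting back-ticks here.
fence <- strrep("`", 3)
rmd_file <- file.path(tempdir(), "example.Rmd")
writeLines(c(
  "---",
  "title: \"Example\"",
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r echo=FALSE}"),
  "summary(cars)",
  fence
), rmd_file)

# Knit to HTML; render() returns the path of the output file
if (pandoc_available()) {
  html_file <- render(rmd_file, quiet = TRUE)
  print(html_file)
}
```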
For example, we can summarise the survey data using the code at http://tinyurl.com/jx4gty2, which downloads your survey data from Google Sheets and creates . . .
Copy the code at http://tinyurl.com/jx4gty2 used to create the survey summary document.
Add an additional code chunk plotting responses to the question "Have you ever built a 'data dashboard'?"
Hosting is actually pretty easy. More soon on this!
The Open Source R community writes packages that enable users to do more but code less.
Almost everything today takes a couple of lines of code.
You no longer have to be a programmer to use R.
Packages are rapidly expanding R's capabilities.
There are a growing number of R packages that make it easy to create interactive/web native visualisations.
Many of these are based on a framework called htmlwidgets.
library(leaflet); library(dplyr)
leaflet() %>% addTiles() %>% fitBounds(0, 40, 10, 50)
library(networkD3); data(MisLinks); data(MisNodes)
forceNetwork(Links = MisLinks, Nodes = MisNodes, Source = "source",
             Target = "target", Value = "value", NodeID = "name",
             Group = "group", opacity = 0.7, zoom = TRUE)
library(dygraphs)
dygraph(nhtemp, main = "New Haven Temperatures") %>%
  dyRangeSelector(dateWindow = c("1920-01-01", "1960-01-01"))
library(DT)
datatable(iris, options = list(pageLength = 5))
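Widgets like these can also be written to an HTML file outside of an R Markdown document. A minimal sketch, assuming the DT and htmlwidgets packages are installed; the output file name is a placeholder:

```r
library(DT)
library(htmlwidgets)

# Any htmlwidget object works here; a DT table is used as the example
widget <- datatable(iris, options = list(pageLength = 5))

# Write the widget to an HTML file.
# selfcontained = FALSE keeps the JavaScript/CSS in a side directory;
# selfcontained = TRUE would bundle everything into one file but needs pandoc.
out_file <- file.path(tempdir(), "iris_table.html")
saveWidget(widget, file = out_file, selfcontained = FALSE)
```

Opening the resulting file in a browser should show the same interactive table.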
Dashboards provide an overview of key data.
The flexdashboard R package allows you to easily create dashboards with R Markdown.
Once you have installed the flexdashboard package, in RStudio select File > New File > R Markdown... Then select Flex Dashboard from the From Template list in the New R Markdown dialog:
Like other R Markdown documents, flexdashboards start with a YAML header. E.g. (this is the header of this slide deck; a dashboard would set output to flexdashboard::flex_dashboard instead):
---
title: "DataFest 2017 | Intro to dynamic web documents"
author: "Christopher Gandrud & Dustin Tingley"
date: "18 January 2017"
output:
  ioslides_presentation:
    css: datafest_slides.css
    logo: img/iqss_logo_flat.png
---
Dashboard rows are delimited by the 3rd level markdown header: ###.
Separate columns are delimited with Column followed by -------------------. Column widths can be set in the section heading with the data-width attribute.
You can create multi-page dashboards by placing the tab label followed by ====================== (a level 1 header) after the material you want on the previous tab.
E.g.
Net Downloads
===========================================
Adding .tabset to a section heading can create tabs within a specific section.
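The layout rules above can be combined in one skeleton. The sketch below only writes the file from R so that it is self-contained; the page and chart titles ("Overview", "Chart A", and so on) and the widths are placeholders, not part of the workshop material.

```r
# Write a minimal flexdashboard skeleton to a temporary file.
dash_file <- file.path(tempdir(), "dashboard.Rmd")
writeLines(c(
  "---",
  "title: \"Example dashboard\"",
  "output: flexdashboard::flex_dashboard",
  "---",
  "",
  "Overview",                                     # first page
  "===========================================",
  "",
  "Column {data-width=650}",                      # wide left column
  "-------------------",
  "",
  "### Chart A",
  "",
  "Column {.tabset data-width=350}",              # narrower column with tabs
  "-------------------",
  "",
  "### Chart B",
  "",
  "### Chart C",
  "",
  "Details",                                      # second page
  "===========================================",
  "",
  "### Chart D"
), dash_file)

# Print the skeleton to check the layout
cat(readLines(dash_file), sep = "\n")
```

With the flexdashboard package installed, `rmarkdown::render(dash_file)` should turn this into an empty two-page dashboard to which code chunks can be added under each ### heading.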
We could make a dashboard from the hometowns you provided.
Here's an example (using fake data): http://tinyurl.com/hgk3fxx
You: make a similar dashboard using R source code available at: http://tinyurl.com/zwszswb
Hint: it's typically good practice to place setup code in a separate chunk just under the header, with the code chunk option include=FALSE.
Full code for the exercise can be found at: http://tinyurl.com/zwk5het.
There are lots of free services (e.g. RPubs, GitHub Pages) for hosting webpages for client-side rendering.
Create an RPubs account.
After knitting an R Markdown document to HTML, click on Publish > Publish Document… in the output viewer.
In the resulting pop-up box, click RPubs > Publish
For example: http://rpubs.com/christophergandrud/datafest_ex_dashboard.
Note: RPubs documents do not update when new data is available. RPubs is a static file host.
Shiny apps allow you to create interactive apps that leverage R to conduct analyses in the browser and present results.
For example:
You can turn your flexdashboards into full data exploration apps using flexdashboard + Shiny.
Add runtime: shiny to the YAML header options to declare the dashboard a shiny flexdashboard.
Add a {.sidebar} attribute to the first column in the dashboard. This is where we will place the input controls for the dashboard.
Add Shiny inputs (e.g., things that give us control over what data we look at) and outputs (e.g., ways to look at the data).
Source: http://rmarkdown.rstudio.com/flexdashboard/shiny.html
selectInput("n_breaks", label = "Number of bins:",
            choices = c(10, 20, 35, 50), selected = 20)
sliderInput("bw_adjust", label = "Bandwidth adjustment:",
            min = 0.2, max = 2, value = 1, step = 0.2)
This creates two elements in the input object: n_breaks and bw_adjust.
input is passed to the output code in renderPlot:
renderPlot({
  hist(faithful$eruptions, probability = TRUE,
       breaks = as.numeric(input$n_breaks),
       xlab = "Duration (minutes)",
       main = "Geyser Eruption Duration")
  dens <- density(faithful$eruptions, adjust = input$bw_adjust)
  lines(dens, col = "blue")
})
Note that in the previous examples, almost all of the code is exactly the same as what you would use in R anyway.
The only difference is the inclusion of functions to declare app inputs and outputs.
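For comparison, the same inputs and renderPlot code can be arranged as a standalone Shiny app rather than a flexdashboard. This is a sketch, not part of the workshop material: the output ID `eruption_plot` and the page layout are assumptions chosen for illustration.

```r
library(shiny)

# User interface: the same two inputs, plus a placeholder for the plot
ui <- fluidPage(
  sidebarLayout(
    sidebarPanel(
      selectInput("n_breaks", label = "Number of bins:",
                  choices = c(10, 20, 35, 50), selected = 20),
      sliderInput("bw_adjust", label = "Bandwidth adjustment:",
                  min = 0.2, max = 2, value = 1, step = 0.2)
    ),
    mainPanel(plotOutput("eruption_plot"))
  )
)

# Server: ordinary R plotting code wrapped in renderPlot()
server <- function(input, output) {
  output$eruption_plot <- renderPlot({
    hist(faithful$eruptions, probability = TRUE,
         breaks = as.numeric(input$n_breaks),
         xlab = "Duration (minutes)",
         main = "Geyser Eruption Duration")
    dens <- density(faithful$eruptions, adjust = input$bw_adjust)
    lines(dens, col = "blue")
  })
}

app <- shinyApp(ui = ui, server = server)
# Run interactively with: runApp(app)
```

In the flexdashboard version, the sidebar column plays the role of `sidebarPanel()` and no explicit `ui`/`server` split is needed.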
Create a new shiny dashboard that subsets our Google Sheets survey data based on participants' position (e.g. "undergraduate", "faculty") and returns a bar plot of the subset's R skill level.
Starter R code available at: [GET] Chris fill in.
Completed example R code available at: http://tinyurl.com/hwp5pgf
The Vice Provost for Advances in Learning Research group, as well as others at Harvard, have invested in dashboards that are dynamic and interactive via Shiny.
Let's look at some examples.
These are more involved on the programming side, but still make heavy use of packages.
Further, they can serve as templates as it is common to share Shiny app code! https://shiny.rstudio.com/gallery/
Create a shinyapps.io account and follow the sign-up instructions.
After running your shiny app, click on Publish > Publish Document… in the output viewer.
In the resulting pop-up box connect your ShinyApps.io account following the onscreen instructions and publish.
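Publishing can also be scripted with the rsconnect package instead of the Publish button. The helper below is hypothetical (it is not part of the workshop material), and the directory and account names in the example call are placeholders; it assumes rsconnect is installed and that your account has already been registered.

```r
# Hypothetical helper: deploy the app in `app_dir` to shinyapps.io
# under the named account.
publish_app <- function(app_dir, account) {
  # Account credentials must already be registered, e.g. with
  # rsconnect::setAccountInfo(name, token, secret), using the values
  # shown on your shinyapps.io dashboard.
  rsconnect::deployApp(appDir = app_dir, account = account)
}

# Example call (not run here; both arguments are placeholders):
# publish_app("my_app", account = "my_account")
```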
If data cannot leave Harvard servers, IQSS/VPAL maintain a Shiny Server.
Key features include Harvard Key authentication and support for up to level 3 data security.
When working with data, need to use reproducible code
Need to make data transparent
Tools for ex ante (e.g., power analysis) / ex post investigations (e.g., outliers)
Promotion of work!
As an institution we generate tons of data.
Across the University much data is managed by old spreadsheet practices
And databases/reporting/visualization tools are separated from data science tools
We are not OK with that in research; should we revisit this in other places?
R easily integrates with modern databases (SQL etc.)
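As a small illustration of the database point, the sketch below queries a SQL database from R. It assumes the DBI and RSQLite packages are installed and uses an in-memory SQLite database with a built-in dataset purely so the example is self-contained; a production database would use a different driver and connection details.

```r
library(DBI)

# In-memory SQLite database, used only so the example is self-contained
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Copy a built-in data frame into the database as a table
dbWriteTable(con, "mtcars", mtcars)

# Run SQL and get the result back as an ordinary R data frame
avg_mpg <- dbGetQuery(
  con,
  "SELECT cyl, AVG(mpg) AS avg_mpg FROM mtcars GROUP BY cyl"
)
print(avg_mpg)

dbDisconnect(con)
```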
99.9% of what you saw today is free.
Our students are learning R in their classes
R is now one of the most sought after programming languages/we are in the era of big data
They could help us transform our UniveRsity and Research practices.
What do you want to use Rmarkdown/dashboards for?
What data? How would you want to look at the data?
Everyone in teams contributes to a Google Document.